Method and system that optimizes mean process performance and process robustness

ABSTRACT

A method, system and computer readable medium are disclosed. The method, system and computer readable medium comprises providing a mathematically linked multi-step process for simultaneously determining operating conditions of a system or process that will result in optimum performance in both mean performance requirements and system robustness requirements. In a method and system in accordance with the present embodiment, the steps can be applied to any data array that contains the two coordinated data elements defined previously (independent variables and response variables), and for which a response prediction model can be derived that relates the two elements. The steps are applied to each row of the data array, and result in a predicted C p  response data set. In the preferred embodiment the array is a statistically rigorous designed experiment from which mean performance prediction models can be derived for each response evaluated. The data are applied to each row of the data array.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of U.S. application Ser. No. 11/434,043, filed May 15, 2006, entitled “Method and System that Optimizes Mean Process Performance and Process Robustness”, which is incorporated herein by reference.

FIELD OF THE INVENTION

The present embodiment relates generally to research, development and engineering studies and more particularly to providing optimum requirements for a designed experiment for such studies.

BACKGROUND OF THE INVENTION

Research, development, and engineering studies typically provide data sets that contain two coordinated data elements:

-   -   One or more independent variables (study factors) at two or more         levels.     -   One or more associated quality attributes or performance         characteristics (response variables) obtained at each level or         level combination of the study factors.

Numerical analysis techniques can be applied to such coordinated data to obtain a prediction model (equation) for each response. The word prediction indicates that the model predicts the response that would be obtained at a given input level of each included study factor. Linear regression analysis is one example of a numerical analysis approach that can yield a prediction model. The simplest form of such a model is the equation of a straight line presented in Equation 1.

Ŷ=mX+b  Equation 1

In Equation 1 Ŷ is the predicted response obtained from the model, m is the slope of the prediction line, X is an input level of the study factor, and b is a constant corresponding to the Y-intercept. The linear model presented in Equation 1 can be expanded to include any number of study factors and factor effects (e.g. interaction effects, simple curvature effects, nonlinear effects, etc.). A quadratic model that includes simple pairwise interaction effects (X_(i)*X_(j), i≠j) and simple curvature effects (X_(i) ²) of two study factors (X1, X2) is presented in Equation 2. The quadratic model is the model underlying most commonly used statistical optimization experiment designs, also referred to as response surface designs.

Ŷ=β ₀+β₁ X ₁+β₂ X ₂+β₁₂ X ₁ X ₂+β₁₁ X ₁ ²+β₂₂ X ₂ ²  Equation 2

The models obtained from analysis of research, development, and engineering experiments are used to obtain predicted responses at given input combinations of the study factor level settings. In this way the user can identify the study factor level settings that will most improve the responses corresponding to the prediction models. These studies are typically undertaken to meet one of two overarching improvement goals for each response:

Mean performance goal—achieve a specific target value of the response, or a response that exceeds some minimum requirement.

Process robustness goal—minimize variation in response quality attributes or performance characteristics over time.

A statistically designed experiment is the most rigorous and correct approach to obtaining accurate response prediction models from which the study factor level settings that most improve the response(s) can be determined. To illustrate, consider a batch drug synthesis process that consists of three process parameters (study factors): the process operating time per batch (run time), the mixing speed of the rotor in the material blending tank (stir rate), and the temperature maintained in the blending tank (mixing temp.). These process parameters are presented in Table 1, along with their current operating setpoint levels and appropriate study ranges for a statistical optimization experiment.

TABLE 1 Process Control Parameter Current Process Experiment Range (Study Factor) Operating Setpoint Around Setpoint Run time (minutes) 40 30-50 Stir rate (rpm) 14  8-20 Mixing temp. (Deg. C.) 50 40-60

For this process the critical response is the measured amount of drug produced (% Yield). To quantitatively define the effects of changes to the three process parameters on the % yield response requires conducting a statistically designed experiment in which the parameters are varied through their ranges in a controlled fashion. A typical response surface experiment to study the three process parameters would require 17 experiment runs—eight runs that collectively represent lower bound and upper bound level setting combinations, six runs that collectively represent combinations of the lower bound and upper bound of one variable with the midpoints of the other two variables, and three repeat runs of the current process operating setpoints (range midpoints). This experiment design is shown in Table 2 (in non-random order).

TABLE 2 Experiment Run Time Stir Rate Mixing Temp. Run (minutes) (rpm) (Deg. C.) Run 1 30 8 40 Run 2 50 8 40 Run 3 30 20 40 Run 4 50 20 40 Run 5 30 8 60 Run 6 50 8 60 Run 7 30 20 60 Run 8 50 20 60 Run 9 30 14 50 Run 10 50 14 50 Run 11 40 8 50 Run 12 40 20 50 Run 13 40 14 40 Run 14 40 14 60 Run 15 40 14 50 Run 16 40 14 50 Run 17 40 14 50

Assuming reasonably good data, analysis of the response surface experiment results can provide accurate quadratic prediction models for each response evaluated in the study. These models can then be used to identify the study factor level settings that will most improve the responses.

Process Robustness Goal

A statistically designed experiment is the most rigorous and correct approach to accurately defining the process robustness associated with any given measured response. To illustrate, consider again the batch drug synthesis process described previously. The process parameters are again presented in Table 3, along with their current operating setpoint levels and study ranges. However, note that in this case each study range is defined by the variation around the parameter's setpoint expected during normal operation (random error range).

TABLE 3 Process Control Parameter Current Process Expected Variation (Study Factor) Operating Setpoint Around Setpoint Run time (minutes) 40 38-42 Stir rate (rpm) 14 13.5-14.5 Mixing temperature (Deg. C.) 50 49.5-50.5

To quantitatively define process robustness for the % yield response at the defined parameter setpoints (current operating setpoints in this case) requires conducting a statistically designed experiment in which the parameters are varied through their error ranges in a controlled fashion. A typical statistical experiment to define robustness for the three parameters would require 11 experiment runs—eight runs that collectively represent all combinations of the error range lower and upper bound settings, and three repeat runs of the current process operating setpoints. This experiment design is shown in Table 4 (in non-random order).

TABLE 4 Experiment Run Time Stir Rate Mixing Temp. Run (minutes) (rpm) (Deg. C.) Run 1 38 13.5 49.5 Run 2 42 13.5 49.5 Run 3 38 14.5 49.5 Run 4 42 14.5 49.5 Run 5 38 13.5 50.5 Run 6 42 13.5 50.5 Run 7 38 14.5 50.5 Run 8 42 14.5 50.5 Run 9 40 14 50 Run 10 40 14 50 Run 11 40 14 50

Again assuming reasonably good data, analysis of the robustness experiment can define the effect on the responses evaluated of each of the process parameters studied, individually and in combination. The magnitude of their cumulative effects on the response is an indirect indication of the process robustness at the one defined parameter setpoint combination.

It must be understood that analysis of the % yield data obtained from the statistically designed robustness experiment presented in Table 4 can only define the process robustness for % yield associated with the current setpoint level setting combination of the process parameters. There are two critical limitations in the information available from the experiment:

1. The experiment ranges of the variables are those expected due to random error. The ranges are therefore too small to provide the data from which an accurate prediction model of the % yield response can be developed. This same problem is inherent in the analysis of historical data sets, as will be discussed shortly.

2. The data obtained from the designed experiment can not be used to predict what the process robustness for % yield will be at any other level setting combination of the process parameters.

To address the process robustness goal, historical data are sometimes used as an alternative to a designed experiment. However, this approach is extremely unlikely to provide an acceptable result due to the statistical inadequacy of most historical data sets. Put simply, quantitatively defining robustness requires development of accurate response prediction models. Historical data are obtained from monitoring process operation and output over time. In these data the changes to the process operating parameters during process operation are not done in a controlled fashion. Instead, the changes are due to random variation in the process parameters about their setpoints (random error). It is normally not possible to obtain accurate prediction models from such data sets due to two fundamental flaws:

1. The response variation is due to random error variation in the process parameters. Therefore the magnitudes of the response changes are small—normally in the range of measurement error, and so can not be accurately modeled. This condition is referred to as low signal-to-noise ratio.

2. The process parameters are varied in an uncontrolled fashion. This normally results in data sets in which the process parameters are not represented as independent. The lack of independence severely compromises the ability to develop accurate response prediction models.

Another alternative approach to using a designed experiment is to conduct a simulation study using Monte Carlo methods. In this approach a random variation data set is first created for each process parameter; the random setpoint combinations are then input into a mean performance model to generate a predicted response data set. The final step involves statistically characterizing the response data set distribution and using that result to define the process robustness, again at the defined parameter setpoints. This is an extremely computationally intensive approach, and suffers the same two limitations presented above for the statistically designed process robustness experiment.

Current Practice—Sequential Experimentation

As the discussion above points out, both the statistically designed process robustness experiment and the Monte Carlo simulation approach only define the response robustness at a single set of conditions. Therefore, to meet both the mean performance goal and the process robustness goal simultaneously, a complete statistical robustness experiment must be carried out at each set of process parameter conditions (each experiment run) in the statistical optimization experiment. This would require 187 experiments (17×11). This is almost universally impractical. Therefore, the two experiment goals are normally addressed sequentially.

In the sequential approach the mean performance goal experiment is conducted first, and the response prediction models are used to define the optimum process parameter settings. The process robustness goal experiment is then carried out to define the robustness at the optimum process parameter settings. There is obviously a tremendous limitation to this approach:

When the optimum process parameter settings addressed in the process robustness experiment do not meet the robustness goal, the experimenter must start over. However, time and budget restrictions invariably do not allow for additional iterations of the sequential experiment approach. The result is that most process systems are sub-optimal in terms of robustness, and the consequence is a major cost in terms of significant process output being out of specification Accordingly, what is needed is a system and method to overcome the above-identified issues. The present invention addresses these needs.

SUMMARY OF THE INVENTION

A method, system and computer readable medium are disclosed. The method, system and computer readable medium comprises providing a mathematically linked multi-step process for simultaneously determining operating conditions of a system or process that will result in optimum performance in both mean performance requirements and system robustness requirements. In a method and system in accordance with the present embodiment, the steps can be applied to any data array that contains the two coordinated data elements defined previously (independent variables and response variables), and for which a response prediction model can be derived that relates the two elements. The steps are applied to each row of the data array, and result in a predicted C_(p) response data set. In the preferred embodiment the array is a statistically rigorous designed experiment from which mean performance prediction models can be derived for each response evaluated. The data are applied to each row of the data array.

1. Create the parameter-based propagation-of-error matrix template (PPOE template) based on the experiment variables and their associated control limits.

2. Using the PPOE template generate the joint probability of occurrence matrix (P_(j) matrix) by computing a P_(j) value for each PPOE template combination of the experiment variables.

3. Using the PPOE template generate a response prediction matrix (Y_(pred) matrix) by computing a Y_(pred) value for each PPOE template combination of the experiment variables.

4. Transform acceptance limits (±AL) defined on a relative scale for the response to actual lower and upper acceptance limits (LAL and UAL) around the Y_(pred) calculated for the setpoint level settings of the variables in preparation of calculating the C_(p).

5. For each element in the Y_(pred) matrix, center the response prediction model error distribution about the mean predicted value and calculate the proportional amount of the distribution that is outside the response acceptance limits.

6. Transform the data distribution of the φ matrix values from a uniform distribution to a Gaussian distribution. This is done by weighting each of the φi values by its corresponding joint probability of occurrence.

7. Calculate the failure rate at setpoint (FRSP). The FRSP is the proportion of the predicted response distribution that is outside the acceptance limits.

8. Calculate the standard deviation of the predicted response distribution using the FRSP calculated in step 7 above.

9. Calculate the confidence interval of the predicted response distribution (CI_(prd)).

10. Calculate the C_(p) for a given row in the data array using the CI_(prd) obtained in step 9 and the actual response acceptance limits obtained in step 4.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a graph showing expectation of variation around the setpoint.

FIG. 2 presents the LCL and UCL values around a % organic setpoint of 80%.

FIG. 3 presents a software dialog that enables input of the expectation of the maximum variation around setpoint over time under normal operation.

FIG. 4 is a graph showing estimated variation around mean response.

FIG. 5 presents a software dialog that enables input of the confidence interval on which the robustness estimate is to be based for each response.

FIG. 6 presents confidence limits and acceptance limits for the % API response.

FIG. 7 shows the C_(p) calculations for the % API response.

FIG. 8A is a typical system which implements the software in accordance with the embodiment.

FIG. 8B is a flow chart illustrating the operation of a method in accordance with the present invention.

FIG. 8C is a graph which shows normal probability distribution around % organic set point.

FIG. 9 presents the joint probability of obtaining any % organic value within its ±σ limits in combination with a pump flow rate of 0.85 mL/min. (its −3σ value).

FIG. 10 presents the extension of the joint probability graph in FIG. 9 to all combinations of the % organic variable with all combinations of the pump flow rate variable.

FIG. 11 is a response surface graph of the completed Y_(pred) matrix values.

FIG. 12 is a graph of the transformed ±10% USP-Res acceptance limits about a predicted mean resolution of 2.01.

FIG. 13 presents Φi for the HPLC method development experiment example.

FIG. 14 is a graph which shows the predicted response distribution associated with the independent variable setpoint values for pump flow rate and % organic of 1.00 mL/min. and 80.0% respectively.

FIG. 15 is a graph which shows the acceptance limits and the CI_(prd) limits bracketing the predicted response distribution associated with the HPLC method development experiment example using the USP-Res response prediction model and values of 1.00 mL/min and 80.0% for the experiment variables pump flow Rate and % organic, respectively.

FIG. 16 is a 3D response surface graph that presents the change in the USP-Res response (Z-axis) as the experiment variables pump flow rate (X-axis) and % organic (Y-axis) are simultaneously varied within their respective experiment ranges.

FIG. 17 is a 3D response surface graph that presents the change in the USP-Res response (Z-axis) as the experiment variables pump flow rate (X-axis) and % organic (Y-axis) are simultaneously varied within their respective experiment ranges.

FIG. 18 is an overlay graph which presents a graphical solution in terms of the variable level setting combination that will simultaneously meet or exceed both defined response goals.

DETAILED DESCRIPTION

The present embodiment relates generally to research, development and engineering studies and more particularly to providing optimum requirements for a designed experiment for such studies. The following description is presented to enable one of ordinary skill in the art to make and use the embodiment and is provided in the context of a patent application and its requirements. Various modifications to the preferred embodiments and the generic principles and features described herein will be readily apparent to those skilled in the art. Thus, the present embodiment is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features described herein.

DEFINITIONS

This discussion employs as illustrating examples the study factors controlling the solvent blend (mobile phase) used in a high-performance liquid chromatograph (HPLC). The experiment variables are the concentration of the organic solvent in the blend (% organic) and the rate of flow of the solvent blend through the HPLC system (pump flow rate). The response variable is the measured percent of an active pharmaceutical ingredient in a sample (% API).

1. Experiment Variable (X_(i)).

This is also referred to an independent variable or a study factor. It is a settable parameter of the system or process under study.

2. Mean Experiment Variable Level (x_(i)).

Each experiment variable, designated X_(i), will have a level setting, designated x_(i), in each experiment design run. For each variable the x_(i) level setting in the design run is defined as the setpoint (mean value) of the variable X_(i) distribution. The setpoint and distribution are pictured in FIG. 1A. The variable X_(i) distribution is defined below.

3. Variable X_(i) Error Distribution

The X_(i) error distribution is the distribution of experiment variable level settings around the setpoint (x_(i)) due to random variation during normal operation (random error).

-   -   Lower control limit=LCL     -   Upper control limit=UCL

The LCL and UCL are the lower and upper bounds, respectively on the X_(i) error distribution. The LCL and UCL values around a given setpoint level setting for each experiment variable are based on the expectation of the maximum variation around the setpoint over time under normal operation. In this embodiment the values are based on the ±3σ a confidence interval of a normally-distributed error. In other embodiments they could be based on other confidence intervals (e.g. ±1σ, ±2σ, . . . , ±6σ). FIG. 2 presents the LCL and UCL values around a % organic setpoint of 80%.

The preferred embodiment assumes that variation about the x_(i) level settings (setpoints) of the experiment variables is due to random error alone—all errors are normal Gaussian, independent and identically distributed (IID). However, the IID assumption is not required. The variation of each x_(i) can be independently estimated from experimental data, historical records, etc., and the characterized variation distribution can then be the basis of the control limits defined for variable X_(i).

FIG. 3 presents a software dialog that enables input of the expectation of the maximum variation around setpoint over time under normal operation. In this dialog the input for each variable is the ±3σ confidence interval associated with the expected error distribution. Using the expectation of error illustrated in FIG. 2, the ±3σ confidence interval that would be input for % organic would be ±3.00 (UCL minus LCL).

1. Parameter-Based Propagation-of-Error Matrix Template (PPOE Template)

Given the x_(i) setpoints of the experiment variables in a given design run (x_(i1), x_(i2), . . . , x_(in)), the PPOE template is an n-dimensional array template of all possible level-setting combinations of the variables in which the level settings of each variable used in combination are the 0.5σ increment values along the −3σ to +3σ setting interval about the x_(i) setpoint, as defined by the ±3σ confidence interval.

For example, given the two variables pump flow rate and % organic, with setpoint levels and ±3σ confidence intervals of 1.00±0.15 mL/min and 80±3.0%, respectively, the PPOE template of all possible level setting combinations would yield the 2-dimensional array template presented in Table 5. Note that this is not limited to two dimensions, and an array can be generated for any number of experiment variables.

TABLE 5 Parameter-based propagation of error matrix template (PPOE template) % Organic 77.0 77.5 78.0 78.5 79.0 79.5 80.0 80.5 81.0 81.5 82.0 82.5 83.0 Pump 0.85 Flow 0.88 Rate 0.90 0.93 0.95 0.98 1.00 1.03 1.05 1.08 1.10 1.13 1.15

In the preferred embodiment the PPOE template is generated using the balanced mixed level design type. This design type allows process robustness estimation to be based on the same data construct in all cases for all independent variables. The large size of this design is not a problem, nor is any lack of ability to achieve any of the variable level setting combinations in the template, since the matrix is a “virtual design” used for simulation calculations and will not actually be carried out.

In the preferred embodiment the structure of the PPOE template is based on the previously stated assumption that the variations about the x_(i) level settings (setpoints) of the experiment variables are normal Gaussian errors, independent and identically distributed (IID). However, the assumption is not required. As stated, a characterized variation distribution of each x_(i) can be the basis of the PPOE template structure.

2. Response Variable (Y_(i))

This is also referred to as a dependent variable. This is a quality attribute or performance characteristic measured on the final process output or at a point in the process.

3. Response Variable Target (T_(i))

This is the desired value of the response variable.

4. Mean Response Variable Level (y_(i))

Each response, designated Y_(i), will have a level setting, designated y_(i), associated with each design run. For each response the y_(i) level is defined as the “mean” value of the Y_(i) distribution. The mean response and response distribution are pictured in FIG. 4. The response Y_(i) distribution is defined below.

5. Response Y_(i) Distribution

This is the distribution of response values about the mean response (y_(i)) due to the random variations in the X_(i) variables occurring during normal operation.

-   -   Lower confidence limit=LCL     -   Upper confidence limit=UCL

These values are defined by the prediction confidence interval of the response Y_(i) distribution on which the process robustness estimate is to be based. FIG. 4 presents these bounds obtained when the response Y_(i) distribution is located about a % API target of 100% using a ±3σ confidence interval of ±1.50.

-   -   Lower acceptance limit=LAL     -   Upper acceptance limit=UAL

These are the minimum and maximum acceptable values of a given response variable (Y_(i)). In normal operation process output that falls outside these limits would be rejected.

Note: these values are also referred to as specifications limits and tolerance limits.

FIG. 5 presents a software dialog that enables input of the confidence interval on which the robustness estimate is to be based for each response. The dialog also enables input of the associated LAL, UAL, and T_(i) values for each response. The confidence limits and acceptance limits are presented in FIG. 6 for the % API response based on a T_(i) of 100%, a ±3σ confidence interval of ±1.50, and acceptance limits of ±2.00%.

1. Process Capability (C_(p))

This is a direct, quantitative measure of process robustness used routinely in statistical process control (SPC) applications. The classical SPC definition of “inherent process capability” (C_(p)) is:

$C_{p} = \frac{{UTL} - {LTL}}{6\sigma \mspace{14mu} {variation}}$

where UTL and LTL=tolerance (product specification) limits, and 6σ variation=±3σ process output variation. In this embodiment we simply substitute the nomenclature acceptance limits for tolerance limits. The traditional goal for C_(p) is described below.

C_(p) goal≧1.33 (standard goal based on setting the UTL and LTL at ±4σ of process output variation).

When C_(p) is below 1.00, probability of Type I error (α) becomes significant.

α=P(type I error)=P(reject H ₀ |H ₀ is true)

β=P(type II error)=P(fail to reject H ₀ |H ₀ is false)

FIG. 7 shows the C_(p) calculations for the % API response based on a T_(i) of 100%, a ±3σ confidence interval of ±1.50, and acceptance limits of ±2.00%.

Embodiment

This document describes a new embodiment in the form of a methodology that overcomes (1) the statistical deficiencies associated with historical data sets, and (2) the common failing of the sequential experimental approach without requiring the extremely large amount of work that would otherwise be required. The methodology provides the ability to optimize study factors in terms of both mean performance and process robustness in a single experiment that requires no additional experiment runs beyond the number required of a typical statistically designed mean response experiment.

The methodology includes computational approaches that are statistically superior to, and more defensible than, traditional approaches. Relative to traditional simulation approaches, the methodology is tremendously more computationally efficient overall, and eliminates the need for the computational steps required to characterize predicted response data distributions.

The methodology contains three key elements: up-front definition of performance-driven robustness criteria, a rigorous and defensible experiment design strategy, and statistically valid data analysis metrics for both mean performance and process robustness. The methodology includes the 10-step process providing this information.

FIG. 8A is a typical system 10 which implements the software in accordance with the embodiment. The system 10 includes an embodiment 12 coupled to a processing system 14.

In one embodiment, the methodology is implemented in the form of computer software that automates research, development, and engineering experiments. This document describes a typical embodiment in the application context of developing a high-performance liquid chromatography (HPLC) analytical method. In this embodiment the software that executes the invention is integrated within software that automates HPLC method development experiments.

FDA and ICH guidance documents addressing chromatographic analytical method development state that a best practices approach to method development incorporates robustness into the method development process. This is a qualitative guidance statement—as no standard experimental methodology to accomplish this “best practices” approach is available, none is referenced in the guidance documents. In the embodiment a best practices approach is automated that conform to regulatory guidances through statistically rigorous and defensible analytical method development experimentation that incorporates robustness using quantitative process capability based metrics. To describe the features of the present invention in more detail, refer now to the following description in conjunction with the accompanying figures.

FIG. 8B is a flow chart illustrating the operation of a method in accordance with the present invention. First, provide a first set of data, via step 102. Next, provides a multi-step mathematical process for simultaneously determining operating conditions of a system or process that will result in optimum performance for both mean performance requirements and robustness requirements, via step 104.

Procedure

Below is a summary of the calculation steps associated with the multi-step process described above. The steps can be applied to any data array that contains the two coordinated data elements defined previously (independent variables and response variables), and for which a response prediction model can be derived that relates the two elements. The steps are applied to each row of the data array, and result in a predicted C_(p) response data set. In the preferred embodiment the array is a statistically rigorous designed experiment from which mean performance prediction models can be derived for each response evaluated.

1. Create the parameter-based propagation-of-error matrix template (PPOE template) based on the experiment variables and their associated control limits.

2. Using the PPOE template generate the joint probability of occurrence matrix (P_(j) matrix) by computing a P_(j) value for each PPOE template combination of the experiment variables.

3. Using the PPOE template generate a response prediction matrix (Y_(pred) matrix) by computing a Y_(pred) value for each PPOE template combination of the experiment variables.

4. Transform acceptance limits (±AL) defined on a relative scale for the response to actual lower and upper acceptance limits (LAL and UAL) around the Y_(pred) calculated for the setpoint level settings of the variables in preparation of calculating the C_(p).

5. For each element in the Y_(pred) matrix, center the response prediction model error distribution about the mean predicted value and calculate the proportional amount of the distribution that is outside the response acceptance limits.

6. Transform the data distribution of the φ matrix values from a uniform distribution to a Gaussian distribution. This is done by weighting each of the φi values by its corresponding joint probability of occurrence.

7. Calculate the failure rate at setpoint (FRSP). The FRSP is the proportion of the predicted response distribution that is outside the acceptance limits.

8. Calculate the standard deviation of the predicted response distribution using the FRSP calculated in Step 7 above.

9. Calculate the confidence interval of the predicted response distribution (CI_(prd)).

10. Calculate the C_(p) for a given row in the data array using the CI_(prd) obtained in Step 9 and the actual response acceptance limits obtained in Step 4.

The following pages contain a detailed presentation of the calculations associated with an embodiment. The detailed presentation employs an example analytical method development study involving a high-performance liquid chromatograph (HPLC) in an embodiment. The experiment variables in the study are pump flow rate and % organic, with mean variable levels and ±3σ control limits in a given design run of 1.00±0.15 mL/min. and 80±3.0%, respectively. The response variable is the measured USP resolution (USP-Res) between an active pharmaceutical ingredient and a degradant. The experiment design matrix used in the study is presented below.

Example Design-Response Matrix Run No. Pump Flow Rate % Organic 1 1 92.5 2 1 67.5 3 0.5 92.5 4 0.75 73.75 5 1.5 67.5 6 1.25 86.25 7 1.5 92.5 8 1.5 67.5 9 0.5 80 10 1 80 11 1.5 92.5 12 0.5 67.5 13 1.25 73.75 14 0.5 80 15 1.5 80

Step 1. Create the parameter-based propagation-of-error matrix template (PPOE template) based on the experiment variables and their associated control limits.

An example dialog for user input of experiment variable control limits is presented in FIG. 3.

As an example, given two experiment variables pump flow rate and % organic, with mean variable levels and ±3σ control limits in a given design run of 1.00±0.15 mL/min. and 80±3.0%, respectively, the PPOE template of all possible level setting combinations would yield the 2-dimensional array template presented in Table 6.

TABLE 6 PPOE template % Organic 77.0 77.5 78.0 78.5 79.0 79.5 80.0 80.5 81.0 81.5 82.0 82.5 83.0 Pump 0.85 Flow 0.88 Rate 0.90 0.93 0.95 0.98 1.00 1.03 1.05 1.08 1.10 1.13 1.15

To illustrate, consider the % organic variable having a mean level x_(i) of 80.0% and ±3σ control limits of ±3.0% in a given design run. Applying a normal probability distribution to the variable with distribution parameters X=80% and ±3σ limits=±3.0% yields the graph shown in FIG. 8C. Notice that the % organic levels of 77.0% and 83.0% are at the −3.0σ and +3.0σ limits, respectively.

Step 2. Using the PPOE template generate the joint probability of occurrence matrix (P_(j) matrix) by computing a P_(j) value for each PPOE template combination of the experiment variables.

Calculate the joint probability of occurrence (P_(j)) of each level setting combination of the experiment variables in the PPOE template. When the error distributions associated with the variables are independently distributed, P_(j) is the simple product of the individual probabilities associated with the level of each variable in a given combination. The individual probability of occurrence (P_(i, i=1, 2, . . . , n)) associated with a given experiment variable level setting in the PPOE template, the variation in which follows a Gaussian (normal) distribution, is defined as:

$P_{i} = \frac{0.5*^{\frac{- {(\frac{({x - \overset{\_}{X}})}{\sigma})}^{2}}{2}}}{\sqrt{2*\pi}}$

To illustrate using the HPLC method development experiment example, the P₁ associated with a pump flow rate of 0.85 mL/min (with 1σ=±0.05) is:

$P_{1} = {\frac{0.5*^{\frac{- {(\frac{({0.85 - 100})}{0.05})}^{2}}{2}}}{\sqrt{2*\pi}} = 0.002216}$

P₂ associated with a % organic of 77% (with 1σ=±1.00) is:

$P_{2} = {\frac{0.5*^{\frac{- {(\frac{({77.0 - 80.0})}{1.0})}^{2}}{2}}}{\sqrt{2*\pi}} = 0.002216}$

P_(j) is the simple product of the individual probabilities (P₁ and P₂), which is:

P _(j) =P ₁ *P ₂=0.002216*0.002216=0.000005

Generating the joint probability of occurrence for each element of the PPOE template in the manner described above yields the P_(j) matrix presented in Table 7.

TABLE 7 Joint probability matrix (P_(j) matrix) % Organic 77.0 77.5 78.0 78.5 79.0 79.5 80.0 Pump 0.85 0.000005 0.000019 0.000060 0.000144 0.000268 0.000390 0.000442 Flow 0.88 0.000019 0.000077 0.000237 0.000568 0.001060 0.001543 0.001748 Rate 0.90 0.000060 0.000237 0.000729 0.001748 0.003266 0.004752 0.005385 0.93 0.000144 0.000568 0.001748 0.004194 0.007835 0.011400 0.012918 0.95 0.000268 0.001060 0.003266 0.007835 0.014637 0.021297 0.024133 0.98 0.000390 0.001543 0.004752 0.011400 0.021297 0.030987 0.035113 1.00 0.000442 0.001748 0.005385 0.012918 0.024133 0.035113 0.039789 1.03 0.000390 0.001543 0.004752 0.011400 0.021297 0.030987 0.035113 1.05 0.000268 0.001060 0.003266 0.007835 0.014637 0.021297 0.024133 1.08 0.000144 0.000568 0.001748 0.004194 0.007835 0.011400 0.012918 1.10 0000060 0.000237 0.000729 0.001748 0.003266 0.004752 0.005385 1.13 0.000019 0.000077 0.000237 0.000568 0.001060 0.001543 0.001748 1.15 0.000005 0.000019 0.000060 0.000144 0.000268 0.000390 0.000442 % Organic 80.5 81.0 81.5 82.0 82.5 83.0 Pump 0.85 0.000390 0.000268 0.000144 0.000060 0.000019 0.000005 Flow 0.88 0.001543 0.001060 0.000568 0.000237 0.000077 0.000019 Rate 0.90 0.004752 0.003266 0.001748 0.000729 0.000237 0.000060 0.93 0.011400 0.007835 0.004194 0.001748 0.000568 0.000144 0.95 0.021297 0.014637 0.007835 0.003266 0.001060 0.000268 0.98 0.030987 0.021297 0.011400 0.004752 0.001543 0.000390 1.00 0.035113 0.024133 0.012918 0.005385 0.001748 0.000442 1.03 0.030987 0.021297 0.011400 0.004752 0.001543 0.000390 1.05 0.021297 0.014637 0.007835 0.003266 0.001060 0.000268 1.08 0.011400 0.007835 0.004194 0.001748 0.000568 0.000144 1.10 0.004752 0.003266 0.001748 0.000729 0.000237 0.000060 1.13 0.001543 0.001060 0.000568 0.000237 0.000077 0.000019 1.15 0.000390 0.000268 0.000144 0.000060 0.000019 0.000005

Summing all the joint probabilities within the P_(j) matrix will always yield a value of 1.000. This will be true for any dimensionality of the PPOE template (any number of experiment variables in the design) and for any response prediction model (any model form and number of model parameters).

FIG. 9 presents the joint probability of obtaining any % organic value within its ±3σ limits in combination with a pump flow rate of 0.85 mL/min. (its −3σ value). FIG. 9 therefore represents a graph of the data in the first row of the P_(j) matrix. The graph is obtained by multiplying the probability of occurrence associated with each % organic value in the PPOE template by the individual probability of occurrence value of 0.002216 associated with a pump flow rate of 0.85 mL/min.

FIG. 10 presents the extension of the joint probability graph in FIG. 9 to all combinations of the % organic variable with all combinations of the pump flow rate variable. It therefore represents a graph of all the data in the P_(j) matrix.

Step 3. Using the PPOE template generate a response prediction matrix (Y_(pred) matrix) by computing a Y_(pred) value for each PPOE template combination of the experiment variables.

In the preferred embodiment each Y_(pred) value is calculated directly using an equation (prediction model) obtained from regression analysis of the experiment data. For example, given an HPLC method development experiment that contained the variables pump flow rate and % organic, and a USP resolution response (USP-Res) calculated for each peak identified in each experiment run, a prediction model relating the effects of the variables to the response could be obtained from linear regression analysis of the experiment data set of the form:

USP-Res=−5.8777185−1.848083*Pump Flow Rate+0.155887*% Organic+0.8104334*(Pump Flow Rate)̂2−0.0004528*(% Organic)̂2−0.0081393*Pump Flow Rate*% Organic

Given a pump Flow Rate of 0.85 mL/min. and a % organic of 77.0%, the predicted USP-Res value would be calculated from the model as:

USP-Res=−5.8777185−1.848083*(0.85)+0.155887*(77.0)+0.8104334*(0.85)̂2−0.0004528*(77.0)̂2−0.0081393*0.85*77.0=1.92

Using this prediction model to calculate a Y_(pred) value of USP-Res for each element in the PPOE template array yields the array of predicted values shown in Table 8. The corresponding response surface graph of the completed Y_(pred) matrix values is shown in FIG. 11.

TABLE 8 Y_(pred) matrix % Organic 77.0 77.5 78.0 78.5 79.0 79.5 80.0 80.5 81.0 81.5 82.0 82.5 83.0 Pump 0.85 1.92 1.96 2.00 2.04 2.08 2.12 2.16 2.19 2.23 2.27 2.31 2.34 2.38 Flpw 0.88 1.90 1.94 1.97 2.01 2.05 2.09 2.13 2.17 2.20 2.24 2.28 2.32 2.35 Rate 0.90 1.87 1.91 1.95 1.99 2.03 2.06 2.10 2.14 2.18 2.22 2.25 2.29 2.33 0.93 1.85 1.88 1.92 1.96 2.00 2.04 2.08 2.11 2.15 2.19 2.23 2.26 2.30 0.95 1.82 1.86 1.90 1.94 1.98 2.01 2.05 2.09 2.13 2.17 2.20 2.24 2.28 0.98 1.80 1.84 1.88 1.91 1.95 1.99 2.03 2.07 2.10 2.14 2.18 2.21 2.25 1.00 1.78 1.82 1.85 1.89 1.93 1.97 2.01 2.04 2.08 2.12 2.16 2.19 2.23 1.03 1.76 1.79 1.83 1.87 1.91 1.95 1.99 2.02 2.06 2.10 2.13 2.17 2.21 1.05 1.74 1.77 1.81 1.85 1.89 1.93 1.96 2.00 2.04 2.08 2.11 2.15 2.19 1.08 1.72 1.76 1.79 1.83 1.87 1.91 1.95 1.98 2.02 2.06 2.09 2.13 2.17 1.10 1.70 1.74 1.78 1.81 1.85 1.89 1.93 1.96 2.00 2.04 2.07 2.11 2.15 1.13 1.68 1.72 1.76 1.80 1.83 1.87 1.91 1.95 1.98 2.02 2.06 2.09 2.13 1.15 1.67 1.70 1.74 1.78 1.82 1.86 1.89 1.93 1.97 2.00 2.04 2.08 2.11

Step 4. Transform acceptance limits (±AL) defined on a relative scale for the response to actual lower and upper acceptance limits (LAL and UAL) around the Y_(pred) calculated for the setpoint level settings of the variables in preparation of calculating the C_(p).

An example dialog for user input of response confidence intervals and acceptance limits is presented in FIG. 5. Acceptance limits can be either units of percent (relative) or units of the response (absolute). In the dialog presented in FIG. 5, the limits are entered in absolute response units. In this discussion of the HPLC method development experiment example, the inputs for the USP-Res response are in relative units of percent. Note that lower and upper limits are entered separately, and are not required to be symmetrical about a target mean response value.

To illustrate using acceptance limits defined as a relative percent, the LAL and UAL about a setpoint are defined as:

LAL=Y _(pred) −AL*Y _(pred)

UAL=Y _(pred) +AL*Y _(pred)

Using the HPLC method development experiment example, given a predicted resolution of 2.01 obtained from the USP-Res model for 80% organic and 1.00 mL/min pump flow rate and user defined acceptance limits of +10%, the transformed LAL and UAL are:

LAL=2.01−0.10*2.00=1.81

UAL=2.01+0.10*2.00=2.21

FIG. 12 presents a graph of the transformed ±10% USP-Res acceptance limits about a predicted mean resolution of 2.01.

Step 5. For each element in the Y_(pred) matrix, center the response prediction model error (σ_(pe) ²) distribution about the mean predicted value and calculate the proportional amount of the distribution that is outside the response acceptance limits.

The proportional amount of the response prediction model error distribution that is outside of the response acceptance limits is referred to as the cumulative probability. For a given element of the Y_(pred) matrix, designated i, the cumulative probability, designated φi, is computed by first integrating the error distribution from −∞ to LAL (designated φ_(LAL)) and from UAL to +∞ (designated φ_(UAL)). The φi value is then obtained as the simple sum of the cumulative probabilities computed for φ_(LAL) and φ_(UAL). The cumulative probability formulas for φ_(LAL) and φ_(UAL) are:

$\varphi_{LAL} = {\frac{1}{\sqrt{2\pi}}{\int_{- \infty}^{LAL}{^{\frac{- {(\frac{t - Y_{pred}}{\sigma})}^{2}}{2}}{t}}}}$ $\varphi_{UAL} = {\frac{1}{\sqrt{2\pi}}{\int_{UAL}^{\infty}{^{\frac{- {(\frac{t - Y_{pred}}{2})}^{2}}{2}}{t}}}}$

Continuing with the HPLC method development experiment example, given user defined AL values of ±10% and a USP-Res model error standard deviation of ±0.0107(σ_(pe) ²=0.000115), the φ_(LAL) and φ_(UAL) values obtained for the completed Y_(pred) matrix element of 1.92 are:

$\varphi_{LAL} = {{\frac{1}{\sqrt{2\pi}}{\int_{- \infty}^{1.81}{^{\frac{- {(\frac{t - 1.92}{0.0107})}^{2}}{2}}{t}}}} \approx 0}$ $\varphi_{UAL} = {{\frac{1}{\sqrt{2\pi}}{\int_{2.21}^{\infty}{^{\frac{- {(\frac{t - 1.92}{0.0107})}^{2}}{2}}{t}}}} \approx 0}$

The corresponding φ_(i) is then the simple sum of the φ_(LAL) and φ_(UAL), or approximately 0. FIG. 13 presents φi for the HPLC method development experiment example.

Generating φi for each element of the Y_(pred) matrix in the manner described above yields the cumulative probability matrix (φ Matrix) presented in Table 9. Note that the φi value computed for any given element in the φ matrix can range from zero to one (0-1), depending on the proportion of the error distribution that is outside the acceptance limits. Note also that the data distribution associated with the φ matrix values is a uniform distribution, as the PPOE template coordinates associated with each data value are represented as equally likely to occur in the φ matrix. In the next step, each of the φi values will be weighted by its corresponding joint probability of occurrence. This will transform the φ matrix data distribution from a uniform distribution to a Gaussian distribution.

TABLE 9 Cumulative probability matrix (φ matrix) % Organic 77.0 77.5 78.0 78.5 79.0 79.5 80.0 80.5 81.0 81.5 82.0 82.5 83.0 Pump 0.85 0.000 0.000 0.000 0.000 0.000 0.000 0.002 0.305 0.992 1.000 1.000 1.000 1.000 Flow 0.88 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.009 0.675 0.997 1.000 1.000 1.000 Rate 0.90 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.040 0.912 0.999 1.000 1.000 0.93 0.002 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.002 0.182 0.979 1.000 1.000 0.95 0.039 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.008 0.575 0.995 1.000 0.98 0.560 0.005 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.044 0.894 0.999 1.00 0.970 0.092 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.003 0.242 0.980 1.03 0.998 0.693 0.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.016 .709 1.05 1.000 0.975 0.129 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.002 0.112 1.08 1.000 0.997 0.706 0.011 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.009 1.10 1.000 1.000 0.970 0.115 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 1.13 1.000 1.000 0.996 0.610 0.008 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.15 1.000 1.000 0.999 0.943 0.064 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000

6. Transform the data distribution of the φ matrix values from a uniform distribution to a Gaussian distribution. This is done by weighting each of the φi values by its corresponding joint probability of occurrence.

To weight each of the φi values, multiply the value by the corresponding joint probability of occurrence matrix (P_(j) matrix) value (the value with the same PPOE template coordinates), as shown in the following equation.

Φ′i=Φi*P _(j) i

Calculating a φ′i value for each element of the φ matrix in this manner will result in a weighted cumulative probability matrix (Φ′Matrix). Note that the elements of the Φ′Matrix will always sum to some number between zero and one (0-1).

Continuing with the HPLC method development experiment example, the φ′i obtained for the φi value associated with pump flow rate=0.85 mL/min. and % organic 77.0% is:

Φ′i=0*0.000005≅0

Calculating φ′i for each element of the φ matrix results in the Φ′Matrix presented in Table 10.

TABLE 10 Weighted Cumulative Probability Matrix (φ′ Matrix) % Organic 77.0 77.5 78.0 78.5 79.0 79.5 80.0 80.5 81.0 81.5 82.0 82.5 83.0 Pump 0.85 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 Flow 0.88 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.001 0.000 0.000 0.000 Rate 0.90 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.002 0.001 0.000 0.000 0.93 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.002 0.001 0.000 0.95 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.002 0.001 0.000 0.98 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.000 1.00 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.03 0.000 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.05 0.000 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.08 0.000 0.001 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.10 0.000 0.000 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.13 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.15 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

7. Calculate the failure rate at setpoint (FRSP) for each row of the data array (the data array is an experiment design matrix in the preferred embodiment). The FRSP is the proportion of the predicted response distribution that is outside the acceptance limits.

Note that the predicted response distribution is the distribution derived from the Y_(pred) matrix for a given setpoint combination of the independent variables. FRSP is calculated as:

${FRSP} = {\underset{1}{\sum\limits^{n}}{\Phi^{\prime}i}}$

Continuing with the HPLC method development experiment example, the FRSP for the Φ′Matrix presented in Table 10 is 0.023. FIG. 14 presents the predicted response distribution associated with the independent variable setpoint values for pump flow rate and % organic of 1.00 mL/min. and 80.0%, respectively. The proportional area of the distribution that is outside of the acceptance limits is shown in red in FIG. 14.

Calculate the standard deviation of the predicted response distribution using the FRSP calculated in step 7 above.

The standard deviation, designated σ_(prd), is the value for which

${FRSP} = {{\frac{1}{\sqrt{2\pi}}{\int_{- \infty}^{LAL}{^{\frac{- {(\frac{t - Y_{pred}}{\sigma_{prd}})}^{2}}{2}}{t}}}} + {\frac{1}{\sqrt{2\pi}}{\int_{UAL}^{\infty}{^{\frac{- {(\frac{t - Y_{pred}}{\sigma_{prd}})}^{2}}{2}}{t}}}}}$

This equation can be alternatively expressed as

${FRSP} = {1 - {\frac{1}{2}\left( {{{erf}\left( \frac{{LAL} - Y_{pred}}{\sigma_{prd}\sqrt{2}} \right)} - {{erf}\left( \frac{{UAL} - Y_{pred}}{\sigma_{prd}\sqrt{2}} \right)}} \right)}}$

where erf is the error function.

σ_(prd) is obtained by iteratively solving the function

${g\left( \sigma_{prd} \right)} = {1 - {FRSP} - {\frac{1}{2}\left( {{{erf}\left( \frac{{LAL} - Y_{pred}}{\sigma_{prd}\sqrt{2}} \right)} - {{erf}\left( \frac{{UAL} - Y_{pred}}{\sigma_{prd}\sqrt{2}} \right)}} \right)}}$

Note that the FRSP, and the approach to calculating σ_(prd) presented in this step, are independent of the form of the response prediction model. This enables the use of nonlinear response prediction models to be used in steps 1-7. This also enables the use of linear response prediction models based on response data that have been transformed prior to model development when the associated error distribution is not Gaussian.

Continuing with the HPLC method development experiment example, the σ_(prd) of the predicted response distribution for the pump flow rate and % organic setpoint level settings of 1.00 mL/min. and 80.0%, respectively is 0.088.

9. Calculate the confidence interval of the predicted response distribution (CI_(prd)).

The CI_(prd) of the predicted response distribution is obtained by multiplying the absolute standard deviation obtained in step 8 (σ_(prd)) by the required confidence interval on which the C_(p) estimate is to be based (CI_(Req)). The equation for computing CI_(prd) is presented below.

CI _(prd)=2*Response Confidence Interval(CI _(Req))*σ_(prd)

Continuing with the HPLC method development experiment example, using the values 3 for CI_(Req) and 0.088 for σ_(prd), the equation above provides a six sigma CI_(prd) of 0.528, as shown below.

CI _(prd)=2*3*0.088=0.528

10. Calculate the C_(p) for a given row in the data array using the CI_(prd) obtained in step 9 and the actual response acceptance limits obtained in step 4.

The equation for computing C_(p) using these data is presented below.

$C_{p} = \frac{{UAL} - {LAL}}{{CI}_{prd}}$

Continuing with the HPLC method development experiment example, using the values 2.21. for UAL, 1.81 for LAL, and 0.528 for CI_(prd), the equation above provides a C_(p) of 0.761, as shown below.

$C_{p} = {\frac{2.21 - 1.81}{0.528} = 0.761}$

FIG. 15 presents the acceptance limits and the CI_(prd) limits bracketing the predicted response distribution associated with the HPLC method development experiment example using the USP-Res response prediction model and values of 1.00 mL/min and 80.0% for the experiment variables pump flow rate and % organic, respectively.

The 10-step process just described results in a process C_(p) value for a given row of a data array (a single run of a statistically designed experiment in the preferred embodiment). Carrying out the 10-step process for each row of the array results in a coordinated C_(p) response data set associated with each original response data set used in the process (response-C_(p)).

Continuing with the HPLC method development experiment example, the statistical experiment design to optimize mean performance that includes the experiment variables pump flow rate and % organic is again presented in Table 11, which now includes the USP-Res response. Carrying out the 10-step process for each run in the design matrix using the associated USP-Res response data results in the coordinated USP-Res-C_(p) response data set. This data set is also shown in the example design-response matrix presented in Table 11. FIG. 16 is a 3D response surface graph that presents the change in the USP-Res response (Z-axis) as the experiment variables pump flow rate (X-axis) and % organic (Y-axis) are simultaneously varied within their respective experiment ranges.

TABLE 11 Example design-response matrix Run No. Pump Flow Rate % Organic USP-Res USP-Res-C_(p) 1 1 92.5 2.87531714 1.209859778 2 1 67.5 1.009507452 0.346458789 3 0.5 92.5 3.574697962 1.110866454 4 0.75 73.75 1.633095437 0.568845149 5 1.5 67.5 0.806259453 0.271522437 6 1.25 86.25 2.282069253 1.06465258 7 1.5 92.5 2.585802793 2.120678559 8 1.5 67.5 0.805211325 0.271522437 9 0.5 80 2.644718265 0.775215823 10 1 80 2.00652785 0.761446068 11 1.5 92.5 2.585587125 2.120678559 12 0.5 67.5 1.58544945 0.423087453 13 1.25 73.75 1.339984865 0.551064866 14 0.5 80 2.644637508 0.775215823 15 1.5 80 1.79050218 0.813562998

Numerical analysis of the example design-response matrix using the USP-Res response data set yields the response prediction model presented below. FIG. 17 is a 3D response surface graph that presents the change in the USP-Res response (Z-axis) as the experiment variables pump flow rate (X-axis) and % organic (Y-axis) are simultaneously varied within their respective experiment ranges. The graph was generated from predicted USP-Res response data obtained using the USP-Res response prediction model.

Response Prediction Model—USP-Res Response

USP-Res=−5.8777185−1.848083*Pump Flow Rate+0.155887*% Organic+0.8104334*(Pump Flow Rate)̂2−0.0004528*(% Organic)̂2−0.0081393*Pump Flow Rate*% Organic

Numerical analysis of the example design-response matrix using the USP-Res-C_(p) response data set yields the response prediction model presented below. FIG. 17 is a coordinated 3D response surface graph that presents the change in the USP-Res-C_(p) response (Z-axis) using the same X-axis and Y-axis variables and ranges. The graph was generated from predicted USP-Res-C_(p) response data obtained using the USP-Res-C_(p) response prediction model.

Response Prediction Model—USP-Res-C_(p) Response

USP-Res C _(p)=exp(−8.7449734+3.702804*Pump Flow Rate+0.165489*% Organic−3.468210*(Pump Flow Rate)̂2−0.00070679*(% Organic)̂2−0.053370*Pump Flow Rate*% Organic+0.0474979*% Organic*(Pump Flow Rate)̂2)−0.02121

The two response prediction models can be linked to a numerical optimizer that searches defined ranges of the independent variables (process parameters) to identify one or more combinations of variable level settings that meet or exceed goals defined for each response.

Continuing with the example method development experiment, using the experimental ranges of the independent variables (pump flow rate and % organic), the response equations presented above, and response goals of ≧2.00 for the USP-Res response and ≧1.33 for the USP-Res-C_(p) response, the Hooke and Jeeves optimizer algorithm provides the solution presented in Table 12 in terms of the variable level setting combination that will simultaneously meet or exceed both defined response goals.

TABLE 12 Hooke and Jeeves numerical optimizer solution Optimum level settings: pump flow rate = 1.5, % organic = 92.5 Optimizer Response Answer −2 Sigma +2 Sigma Variable Predicted Confidence Confidence Name Target Response Limit Limit API-USP- Maximize 2.58955407843 2.54651638883 2.63259176804 Res API-USP- Maximize 2.08814562740 1.87838377199 2.32107038799 Res-Cp

It should be understood that (1) this optimization process can accommodate any number of independent variables and any number of response variables, and (2) other optimizer algorithms can also be used such as the Solver algorithm that is part of the Microsoft® Excel software toolset. It should also be understood that different optimizers, both numerical and graphical, differ in their capabilities in terms of types of goals that can be defined. For example, in some optimizers goals such as maximize—with a lower bound, minimize—with an upper bound, and target—with lower and upper bounds can be defined. Also, some optimizers provide an ability to rank the relative importance of each goal. The two response prediction models can also be linked to a graphical optimizer that presents range combinations of the independent variables (process parameters) that meet or exceed defined goals for each response.

Continuing with the example method development experiment, using the experimental ranges of the independent variables (pump flow rate and % organic), the response equations presented above, and response goals of ≧2.00 for the USP-Res response and ≧1.33 for the USP-Res-C_(p) response, the graphical optimizer provides a graphical solution in terms of the variable level setting combination that will simultaneously meet or exceed both defined response goals. The graphical solution in this case is in the form of the overlay graph presented in FIG. 18. In this graph each response is assigned a color; for a given response the area of the graph that is shaded with the assigned color corresponds to variable level setting combinations that do not meet the defined response goal. The unshaded region of the graph therefore corresponds to variable level setting combinations that simultaneously meet all defined response goals.

Although the present invention has been described in accordance with the embodiments shown, one of ordinary skill in the art will readily recognize that there could be variations to the embodiments and those variations would be within the spirit and scope of the present invention. Accordingly, many modifications may be made by one of ordinary skill in the art without departing from the spirit and scope of the appended claims. 

1. A computer-implemented method comprising: providing a multi-step mathematical process for simultaneously determining operating conditions of a system or process that will result in optimum performance for both mean performance requirements and robustness requirements.
 2. The computer-implemented method of claim 1 wherein the multi-step process utilizes a first set of system or process data that includes independent variables to generate a second set of data used to optimize performance for both mean performance requirements and robustness requirements.
 3. The computer-implemented method of claim 2 wherein the first set of data will be derived from a rigorous statistical experiment design and/or historical data.
 4. The computer-implemented method of claim 3 including utilizing the first set of data to derive mean performance prediction models.
 5. The computer-implemented method of claim 4 wherein the multi-step process includes: creating the parameter-based propagation-of-error matrix template (PPOE template) based on the experiment variables and their associated control limits; using the PPOE template to generate the joint probability of occurrence matrix (P matrix) by computing a P_(j) value for each PPOE template combination of the experiment variables; using the PPOE template to generate a response prediction matrix (Y_(pred) matrix) by computing a Y_(pred) value for each PPOE template combination of the experiment variables; transforming acceptance limits (±AL) for the response to actual lower and upper acceptance limits (LAL and UAL) around the Y_(pred) value calculated for the setpoint level settings of the variables in preparation of calculating the response robustness (C_(p)); for each element in the Y_(pred) matrix, centering the response prediction model error distribution about the mean predicted value and calculating the proportional amount of the distribution that is outside the response acceptance limits resulting in a φ matrix; transforming the data distribution of the matrix values from a uniform distribution to a Gaussian distribution; calculating the failure rate at setpoint (FRSP), wherein the FRSP is the proportion of the predicted response distribution that is outside the acceptance limits; calculating the standard deviation of the predicted response distribution using the FRSP; calculating the confidence interval of the predicted response distribution (CI_(prd)); and calculating the C_(p) for a given row in the first data set using the CI_(prd) obtained and the actual response acceptance limits, resulting in robustness data from which a robustness prediction model is derived.
 6. The computer-implemented method of claim 5 which includes deriving prediction equations from the robustness data.
 7. The computer-implemented method of claim 6, which includes linking the mean performance prediction model and the robustness prediction model to a numerical optimizer that searches defined ranges of the independent variables to identify one or more combinations of variable level settings that meet or exceed goals defined for each response.
 8. The computer-implemented method of claim 6, which includes linking the mean performance prediction model and the robustness prediction model to a graphical optimizer that displays a range of combinations of the independent variables that meet or exceed defined goals for each response.
 9. A computer program product including program instructions for: providing a multi-step mathematical process for simultaneously determining operating conditions of a system or process that will result in optimum performance for both mean performance requirements and robustness requirements.
 10. The computer program product of claim 9 wherein the multi-step process utilizes a first set of system or process data that includes independent variables to generate a second set of data used to optimize performance for both mean performance requirements and robustness requirements.
 11. The computer program product of claim 10 wherein the first set of data will be derived from a rigorous statistical experiment design and/or historical data.
 12. The computer program product of claim 11 includes utilizing the first set of data to derive mean performance prediction models.
 13. The computer program product of claim 12 wherein the multi-step process includes: creating the parameter-based propagation-of-error matrix template (PPOE template) based on the experiment variables and their associated control limits; using the PPOE template to generate the joint probability of occurrence matrix (P_(j) matrix) by computing a P_(j) value for each PPOE template combination of the experiment variables; using the PPOE template to generate a response prediction matrix (Y_(pred) matrix) by computing a Y_(pred) value for each PPOE template combination of the experiment variables; transforming acceptance limits (±AL) for the response to actual lower and upper acceptance limits (LAL and UAL) around the Y_(pred) value calculated for the setpoint level settings of the variables in preparation of calculating the response robustness (C_(p)); for each element in the Y_(pred) matrix, centering the response prediction model error distribution about the mean predicted value and calculate the proportional amount of the distribution that is outside the response acceptance limits resulting in a φ matrix; transforming the data distribution of the φ matrix values from a uniform distribution to a Gaussian distribution; calculating the failure rate at setpoint (FRSP), wherein the FRSP is the proportion of the predicted response distribution that is outside the acceptance limits; calculating the standard deviation of the predicted response distribution using the FRSP; calculating the confidence interval of the predicted response distribution (CI_(prd)); and calculating the C_(p) for a given row in the first data set using the CI_(prd) obtained and the actual response acceptance limits, resulting in robustness data from which a robustness prediction model will be derived.
 14. The computer program product of claim 13 which includes deriving prediction equations from the robustness data.
 15. The computer program product of claim 14, which includes linking the mean performance prediction model and the robustness prediction model to a numerical optimizer that searches defined ranges of the independent variables to identify one or more combinations of variable level settings that meet or exceed goals defined for each response.
 16. The computer program product of claim 14, which includes linking the mean performance prediction model and the robustness prediction model to a graphical optimizer that displays a range of combinations of the independent variables that meet or exceed defined goals for each response.
 17. A computer comprising: a processor; and means for providing a multi-step mathematical process for simultaneously determining operating conditions of a system or process that will result in optimum performance for both mean performance requirements and robustness requirements.
 18. The computer of claim 17 wherein the multi-step process utilizes a first set of system or process data that includes independent variables to generate a second set of data used to optimize performance for both mean performance requirements and robustness requirements.
 19. The computer of claim 18 wherein the first set of data will be derived from a rigorous statistical experiment design and/or historical data.
 20. The computer of claim 19 which includes means for utilizing the first set of data to derive mean performance prediction models.
 21. The computer of claim 20 wherein the multi-step process includes: creating the parameter-based propagation-of-error matrix template (PPOE template) based on the experiment variables and their associated control limits; using the PPOE template to generate the joint probability of occurrence matrix (P_(j) matrix) by computing a P_(j) value for each PPOE template combination of the experiment variables; using the PPOE template to generate a response prediction matrix (Y_(pred) matrix) by computing a Y_(pred) value for each PPOE template combination of the experiment variables; transforming acceptance limits (±AL) for the response to actual lower and upper acceptance limits (LAL and UAL) around the Y_(pred) value calculated for the setpoint level settings of the variables in preparation of calculating a response robustness (C_(p)); for each element in the Y_(pred) matrix, centering the response prediction model error distribution about the mean predicted value and calculating the proportional amount of the distribution that is outside the response acceptance limits resulting in a φ matrix; transforming the data distribution of the φ matrix values from a uniform distribution to a Gaussian distribution; calculating the failure rate at setpoint (FRSP), wherein the FRSP is the proportion of the predicted response distribution that is outside the acceptance limits; calculating the standard deviation of the predicted response distribution using the FRSP; calculating the confidence interval of the predicted response distribution (CI_(prd)); and calculating the C_(p) for a given row in the first data set using the CI_(prd) obtained and the actual response acceptance limits, resulting in robustness data from which a robustness prediction model will be derived.
 22. The computer of claim 21 which includes means for deriving prediction equations from the robustness data.
 23. The computer of claim 22, which includes means for linking the mean performance prediction model and the robustness prediction model to a numerical optimizer that searches defined ranges of the independent variables to identify one or more combinations of variable level settings that meet or exceed goals defined for each response.
 24. The computer of claim 22, which includes means for linking the mean performance model and the robustness prediction model to a graphical optimizer that displays a range of combinations of the independent variables that meet or exceed defined goals for each response. 